Fast Fourier transform processor

ABSTRACT

GROUPS OF CONSECUTIVE SAMPLES FROM TWO SEPARATE CHANNELS ARE SIMULTANEOUSLY PROCESSED IN A CASCADE FAST FOURIER TRANSFORM PROCESSOR BY APPROPRIATELY CONNECTING SELECTED SWITCHES AND DELAYS TO THE COMPUTERS OF THE PROCESSOR.

Inventor: Richard A. Smith, Morristown, N.J.
Filed: July 1, 1968
Patented: June 28, 1971
Assignee: Bell Telephone Laboratories, Incorporated, Murray Hill, N.J.

FAST FOURIER TRANSFORM PROCESSOR
10 Claims, 3 Drawing Figs.

U.S. Cl.: 235/156, 324/77
Int. Cl.: G06f 7/38
Field of Search: 235/152, 156; 307/29, US; 340/167 (B), 147 (C), 147 (Sync), 172.5; 324/77

References Cited

UNITED STATES PATENTS
3,150,374  9/1964  Sunstein et al.  340/147

FOREIGN PATENTS
1,045,203  10/1966  Great Britain  307/39

OTHER REFERENCES
Cooley, "An Algorithm for the Machine Calculation of Complex Fourier Series," Math. of Comp., Vol. 19, pp. 297-301, Apr. 1965

Primary Examiner: Malcolm A. Morrison
Assistant Examiner: David H. Malzahn
Attorneys: R. J. Guenther and William L. Keefauver

(Drawings: Sheet 1 of 2, FIGS. 1 and 2; Sheet 2 of 2, FIG. 3. Inventor: R. A. Smith.)

FAST FOURIER TRANSFORM PROCESSOR

BACKGROUND OF THE INVENTION

This invention relates to data processing and, in particular, to the simultaneous calculation of the Fourier transform or the inverse Fourier transform of two selected sequences of data.

J. W. Cooley and J. W. Tukey, in a paper entitled "An Algorithm for the Machine Calculation of Complex Fourier Series," published Apr. 1965 in Mathematics of Computation, Vol. 19, pages 297-301, outlined an efficient computation procedure for calculating the Fourier transform and the inverse Fourier transform of signal segments represented by sequences of samples. Known as the Cooley-Tukey algorithm, this procedure has led to the development of several special purpose computers. For example, U.S. Pat. application Ser. No. 605,791, filed Dec. 29, 1966, by G. D. Bergland and R. Klahn, and assigned to Bell Telephone Laboratories, the assignee of this invention, and patent application Ser. No. 605,768, filed Dec. 29, 1966, by M. J. Gilmartin and R. R. Shively, now U.S. Pat. No. 3,517,173, and also assigned to Bell Telephone Laboratories, disclose several efficient computational systems which take advantage of certain computational simplicities of the Cooley-Tukey algorithm. In particular, the Bergland-Klahn application discloses a processor capable of calculating in real time the Fourier transforms of selected contiguous segments of a signal.

SUMMARY OF THE INVENTION

This invention provides yet another implementation of the Cooley-Tukey algorithm. According to this invention, by adding selected switches and delays to a sequence of efficient [...]ring or "recursive" equations. The first recursive equation describes how each of a set of samples, either real or complex, is to be operated upon and combined to yield a first set of new data. This first set of data, in turn, is operated upon as required by a second recursive equation to yield a second set of data. This second set of data, in turn, is operated upon in a manner described by a third recursive equation to yield a third set of data. The number of recursive equations in the set depends on the number of samples to be processed. For example, if the number of samples N equals r^m, r and m both being integers, m recursive equations exist and m recursive operations must be carried out. Each recursive operation consists of N similar calculations, including one multiplication and two algebraic summations, each calculation using r pieces of data produced by the previous recursive operation. In calculating the Fourier transform of a set of real samples using the Cooley-Tukey algorithm, the final set of recursive operations yields the amplitudes and phases of the harmonically related frequency components representing the samples. Alternatively, in calculating the inverse Fourier transform, the final set of recursive operations yields the Fourier series representation of the processed samples.

Of significance, in the binary implementation of the fast Fourier transform, that is, when r=2, half of the N multiplications carried out in each recursive operation are redundant, differing from the remaining calculations only by a negative sign. This invention takes advantage of this redundancy to reduce computational effort and increase operating efficiency.

The processor of this invention, known as a cascade fast Fourier transform processor, is in one embodiment composed [...] recursive operation of the Cooley-Tukey algorithm according to this invention, two independent sequences of input data, either consisting of, or derived from, two independent sets of samples, are selectively switched from one input lead of the appropriate computer to the other input lead so as always to provide at the same time to the computer two pieces of data from the same sequence of input data for calculation. Data on one of the input leads and on one of the output leads are selectively delayed.

Then, in accordance with this invention, for each pair of input data samples the computer is used to calculate two new pieces of data for use in the next recursive operation required by the Cooley-Tukey algorithm, rather than just one new piece of data. In essence, N pieces of data associated with a given signal segment require only N/2 calculations to yield N new pieces of data. If each calculation is made in one sample period, the computer in each recursive operation is used only half the time occupied by each incoming signal segment.

Therefore the switches connecting the two independent channels of input data to the computer are operated so as to ensure, together with the delays, that data from one of the channels are available at each computer half the time occupied by the signal segment on that channel while data from the other channel are available at the computer the remaining time. This allows each computer to process twice as many pieces of data as in the prior art.

This invention will be more fully understood in light of the following detailed description taken together with the drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram of one embodiment of a cascade fast Fourier transform processor using the principles of this invention;

FIG. 2 is a schematic block diagram of a computer for use in this invention; and

FIG. 3 relates the operation of the switches in FIG. 1 to the distribution of data at selected points in FIG. 1.

DETAILED DESCRIPTION

The fast Fourier transform, hereinafter called the FFT, results from the use of the Cooley-Tukey algorithm to calculate the Fourier transform or the inverse Fourier transform of discrete samples. The theory on which the FFT is based is fully described in the above-cited paper by Cooley and Tukey and, for example, in the June 1967 IEEE Transactions on Audio and Electroacoustics, Vol. AU-15.

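In brief, the Cooley-Tukey algorithm defines a set of recursive operations to be carried out on a set of input data to yield a desired transformation. If the number of samples N being processed equals 2^m, then, for the purposes of the algorithm, these samples are represented by the symbols X_0(0 0 ... 0) ... X_0(1 1 ... 1). In this case m recursive operations must be performed to produce the Fourier transform or the inverse Fourier transform of these N input samples. The first recursive operation required by the Cooley-Tukey algorithm operates on the N input samples and is written as

$$X_1(j_0, k_{m-2}, \ldots, k_0) = \sum_{k_{m-1}=0}^{1} X_0(k_{m-1}, k_{m-2}, \ldots, k_0)\, W^{j_0 k_{m-1} 2^{m-1}} \qquad (1)$$

where j and k assume the binary values 0 and 1, and W is the complex Nth root of unity defined in the above-cited paper by Cooley and Tukey. The second recursive operation is written as

$$X_2(j_0, j_1, k_{m-3}, \ldots, k_0) = \sum_{k_{m-2}=0}^{1} X_1(j_0, k_{m-2}, \ldots, k_0)\, W^{(2 j_1 + j_0) k_{m-2} 2^{m-2}} \qquad (2)$$

The general recursive operation in the Cooley-Tukey algorithm for r=2 is written as

$$X_{m-p}(j_0, \ldots, j_{m-p-1}, k_{p-1}, \ldots, k_0) = \sum_{k_p=0}^{1} X_{m-p-1}(j_0, \ldots, j_{m-p-2}, k_p, \ldots, k_0)\, W^{(2^{m-p-1} j_{m-p-1} + \cdots + 2 j_1 + j_0) k_p 2^{p}} \qquad (3)$$

where p = m-1, m-2, ..., 0.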
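As an illustration only, the m recursive operations can be sketched in Python for N = 2^m samples. The sketch assumes W = e^{2πi/N} (the sign convention is an assumption, not taken from this specification) and stores each piece of data at the list position given by its binary argument. The names `recursive_operations`, `bit_reverse`, and `direct_dft` are hypothetical helpers introduced for the example.

```python
import cmath

def bit_reverse(value, width):
    """Reverse the lowest `width` bits of `value`."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

def recursive_operations(samples, sign=1):
    """Apply the m recursive operations to N = 2**m samples.

    Position idx of each list holds the datum whose binary argument is idx.
    W = exp(sign * 2j*pi/N); the sign is an assumption of this sketch.
    """
    n = len(samples)
    if n < 1 or n & (n - 1):
        raise ValueError("N must be a power of two")
    m = n.bit_length() - 1
    w = cmath.exp(sign * 2j * cmath.pi / n)
    x = [complex(v) for v in samples]
    for p in range(m - 1, -1, -1):          # p = m-1, m-2, ..., 0
        bit = 1 << p                         # binary digit k_p replaced in this pass
        stage = m - p                        # 1 for the first recursive operation
        nxt = [0j] * n
        for idx in range(n):
            j_value = bit_reverse(idx >> p, stage)   # j_0 + 2*j_1 + ...
            base = idx & ~bit                          # argument with k_p = 0
            nxt[idx] = x[base] + x[base | bit] * w ** (j_value * bit)
        x = nxt
    return x

def direct_dft(samples, sign=1):
    """Reference transform evaluated term by term, for comparison only."""
    n = len(samples)
    return [sum(samples[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
                for k in range(n)) for j in range(n)]

x3 = recursive_operations([1, 2, 3, 4, 5, 6, 7, 8])
reference = direct_dft([1, 2, 3, 4, 5, 6, 7, 8])
```

Under these assumptions the final data appear in bit-reversed order: `x3[idx]` corresponds to `reference[bit_reverse(idx, 3)]`.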

In implementing the Cooley-Tukey algorithm for m=3, first, Equation (1) is applied to the N=2^3 or eight signal samples X_0(000) through X_0(111) to yield a first set of new data X_1(000) through X_1(111). Then Equation (2), the second recursive equation of the algorithm, is applied to X_1(000) through X_1(111) to produce a second set of new data X_2(000) through X_2(111). Finally, Equation (3), with p=0, is applied to X_2(000) through X_2(111) to produce a third and final set of data X_3(000) through X_3(111) representing the required transformation of the samples X_0(000) through X_0(111).

FIG. 1 shows one embodiment of this invention. For convenience in describing the operation of this embodiment it will be assumed that the Fourier transforms of consecutive groups of eight signal samples derived from two independent signals are being calculated, rather than the inverse Fourier transforms of these samples. Of course the inverse Fourier transform of groups of samples derived from two independent signals can also be calculated using the principles of this invention.

To describe in detail the first recursive operation of the Cooley-Tukey algorithm, Equation (1) is expanded as shown below into Equations (1a) through (1h) for the case where the number N of input samples being processed is eight (i.e., m=3).

$$
\begin{aligned}
X_1(000) &= X_0(000) + X_0(100)\,e^{0} &\qquad& \text{(1a)}\\
X_1(001) &= X_0(001) + X_0(101)\,e^{0} && \text{(1b)}\\
X_1(010) &= X_0(010) + X_0(110)\,e^{0} && \text{(1c)}\\
X_1(011) &= X_0(011) + X_0(111)\,e^{0} && \text{(1d)}\\
X_1(100) &= X_0(000) + X_0(100)\,e^{i\pi} && \text{(1e)}\\
X_1(101) &= X_0(001) + X_0(101)\,e^{i\pi} && \text{(1f)}\\
X_1(110) &= X_0(010) + X_0(110)\,e^{i\pi} && \text{(1g)}\\
X_1(111) &= X_0(011) + X_0(111)\,e^{i\pi} && \text{(1h)}
\end{aligned}
$$

The arguments of the X terms are in binary notation. For convenience in describing the operation of this invention with the aid of FIG. 3, the consecutive groups of eight samples on input leads A and B (FIG. 1) will be referred to in FIG. 3 by decimal numbers 1 2...8 and underlined 1 2...8, respectively. The groups of eight samples on lead B of FIG. 1 are shown on row B of FIG. 3 offset by four data samples from the groups of eight input samples on lead A of FIG. 1. While the decimal numbers 1 2 3...8 are repeated periodically on both rows A and B of FIG. 3, it should be understood that these numbers do not represent the same group of eight samples periodically repeated, but rather represent consecutive groups of eight samples from two simultaneously occurring signals. Furthermore, as new data is produced by each recursive operation of the Cooley-Tukey algorithm, this new data will adopt the decimal notation of the input data or samples from which it is derived.

In FIG. 1 the leads labeled A and B carry samples derived from two signals. Lead A supplies a stream of consecutive samples derived from a first signal while lead B supplies a stream of samples derived from a second signal. These samples are, for example, derived by well-known sampling apparatus from two signals, stored in an auxiliary store (not shown), and then presented to the cascade FFT processor shown in FIG. 1 at a rate compatible with the computing speed of this processor.

Initially, switch 30 is in contact with node 11 and switch 31 is in contact with node 13. As a result, the stream of samples on lead A, denoted on row A of FIG. 3 by the decimal notation 1 2...8 1 2...8, is initially transmitted to 4-sample delay 5. Likewise, the samples on lead B, shown on row B of FIG. 3 by the underlined decimal notation 1 2...8 1 2...8, are sent by switch 31 directly to computer 1. The step-like function shown at the top of FIG. 3 denotes the positions with time, measured in sample periods, of switches 30 and 31, which together comprise switch unit S1. Because switches 30 and 31 are initially "up," that is, in contact with nodes 11 and 13, this step function is initially up. But in accordance with this invention, switches 30 and 31, for the case where consecutive groups of eight samples of input data are analyzed, are changed from contact with nodes 11 and 13, respectively, to contact with nodes 12 and 14, respectively, after the fourth sample on lead A has entered delay 5.

Thus, the fifth sample on lead A, rather than going to delay 5, goes directly through switch 31 to computer 1. The samples on lead B then go through switch 30 to delay 5 and fill delay 5 while computer 1 is processing pairs of samples from lead A. Thus delay 5 is always occupied completely with samples from either lead A or lead B or both.
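The action of switch unit S1 and 4-sample delay 5 can be traced with a short Python sketch. It is only a model under stated assumptions: the function name `switch_unit_s1` and the string labels are hypothetical, the switches are taken to change position every four sample periods, and lead B is assumed idle (`None`) before its first group arrives four periods after lead A's.

```python
from collections import deque

def switch_unit_s1(lead_a, lead_b, half=4):
    """Return, per sample period, the pair (delay 5 output, direct lead) at computer 1."""
    delay5 = deque([None] * half)          # 4-sample delay, initially empty
    pairs = []
    for t, (a, b) in enumerate(zip(lead_a, lead_b)):
        up = (t // half) % 2 == 0          # "up": A feeds the delay, B goes direct
        to_delay, direct = (a, b) if up else (b, a)
        delay5.append(to_delay)
        pairs.append((delay5.popleft(), direct))
    return pairs

# Two groups on lead A; one group on lead B, offset by four sample periods.
lead_a = [f"A{n}" for n in range(1, 9)] * 2
lead_b = [None] * 4 + [f"B{n}" for n in range(1, 9)] + [None] * 4
pairs = switch_unit_s1(lead_a, lead_b)
```

In this model the computer sees samples 1 and 5, 2 and 6, 3 and 7, 4 and 8 of lead A during periods 4 through 7, then the same pairing for lead B during periods 8 through 11, so both inputs always come from the same channel.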

As a result, computer 1 now receives simultaneously the first and the fifth samples on lead A, denoted by X_0(000) and X_0(100) in Equations (1a) and (1e) (or by the decimal notation 1 and 5 in FIG. 3). Computer 1 thus begins computing the new data required by Equations (1a) through (1h). As shown by Equation (1a), the first and fifth samples on lead A, denoted by X_0(000) and X_0(100), respectively, are used to compute the first piece of new data X_1(000) and the fifth piece of new data X_1(100). (Note that in this specification the numerical designation of a piece of data is determined by its argument in the appropriate recursive equation, and not by the order in which it is calculated.)

However, as shown by Equation (1e), the fifth piece of new data is a function of the product of X_0(100) and an exponential factor, which differs from the corresponding product required by Equation (1a) only by a minus sign. Therefore computer 1 calculates simultaneously X_1(000) and X_1(100) from the same two pieces of input data by using a subtracting rather than an adding circuit to produce X_1(100). As shown in FIG. 3 on the lines labeled F and G, line F contains the first piece of new data X_1(000) (labeled 1 on line F) produced by the first recursive operation and line G contains the fifth piece of data X_1(100) (labeled 5 on line G) produced by the first recursive operation.
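The saving can be checked numerically. The following Python sketch, offered as an illustration rather than as the circuit itself, evaluates the first recursive operation for eight samples twice: once equation by equation with one multiplication per result, and once pair by pair with one multiplication per pair. The function names are hypothetical and W = e^{2πi/N} is an assumed sign convention (the comparison holds for either sign, since W^4 = -1 in both cases).

```python
import cmath

def first_recursion_naive(x0):
    """One multiplication for every new piece of data: N multiplications."""
    n, half = len(x0), len(x0) // 2
    w = cmath.exp(2j * cmath.pi / n)
    out, multiplications = [], 0
    for idx in range(n):
        j0, low = divmod(idx, half)        # leading binary digit and remaining digits
        out.append(x0[low] + x0[low + half] * w ** (j0 * half))
        multiplications += 1
    return out, multiplications

def first_recursion_paired(x0):
    """One multiplication per pair of new data: N/2 multiplications."""
    n, half = len(x0), len(x0) // 2
    w = cmath.exp(2j * cmath.pi / n)
    out, multiplications = [0j] * n, 0
    for low in range(half):
        product = x0[low + half] * w ** 0      # the single product for this pair
        multiplications += 1
        out[low] = x0[low] + product           # adding circuit
        out[low + half] = x0[low] - product    # subtracting circuit
    return out, multiplications

samples = [1, 2, 3, 4, 5, 6, 7, 8]
naive, naive_count = first_recursion_naive(samples)
paired, paired_count = first_recursion_paired(samples)
```

Both routines give the same eight results, with eight multiplications in the first case and four in the second.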

Likewise, the second and the sixth samples X_0(001) and X_0(101) are, as shown by Equations (1b) and (1f), operated on in computer 1 to produce the second and the sixth pieces of data X_1(001) and X_1(101), respectively (labeled 2 and 6 on lines F and G, respectively, of FIG. 3), generated by the first recursive operation of the Cooley-Tukey algorithm. FIG. 3 shows that these two new pieces of data likewise appear simultaneously on leads F and G of computer 1 (FIG. 1).

In a similar way the third and seventh samples on lead A, X_0(010) and X_0(110), respectively (shown on lines D and E of FIG. 3 to enter computer 1 simultaneously), produce the third and seventh samples X_1(010) and X_1(110) (denoted 3 and 7 on leads F and G of FIG. 3, respectively). Similarly the fourth and eighth input samples on lead A, X_0(011) and X_0(111), produce the fourth and eighth samples, X_1(011) and X_1(111), on leads F and G, respectively.

Now that computer 1 has completed the first recursive operation specified by Equations (1a) through (1h) on the first group of eight samples from lead A, the first group of eight samples from lead B is similarly processed by computer 1 as described above. Meanwhile, switches 30 and 31 have returned to the "up" position and delay 5 is being loaded with the first four samples from the second group of eight samples on lead A in preparation for a similar processing by computer 1 of this group of samples.

Now the output data on lead G enters 2-sample delay 6. The output data on lead F and from delay 6 are then distributed by switch unit S2, comprising switches 32 and 33, to the proper input leads of computer 2, preparatory to the carrying out, by this computer, of the calculations required by the second recursive equation of the Cooley-Tukey algorithm.

According to the second recursive equation of the Cooley-Tukey algorithm, reproduced in detail below as Equations (2a) through (2h), data produced by the first recursive operation is operated upon to produce a second set of data.

As shown by Equation (2a), the first and third pieces of data, X_1(000) and X_1(010), respectively, produced by the first recursive operation, are used to produce the first and third new pieces of data, X_2(000) and X_2(010), respectively, generated by the second recursive operation. However, the output data from computer 1 initially consists of the first and fifth pieces of data, X_1(000) and X_1(100), respectively, generated by the first recursive operation. Thus, the data on lead F is transmitted through switch 32 to 2-sample delay 7.

As shown in FIG. 3 by the step function labeled S2, switches 32 and 33 remain in the "up" position contacting nodes 15 and 17 for the time taken by computer 1 to produce two new pieces of data on each of its output leads. Then switches 32 and 33 are moved to the "down" position in contact with nodes 16 and 18, respectively. Immediately after this move, leads F and G contain X_1(010) and X_1(110), the third and seventh pieces of data, respectively, produced by the first recursive operation. The third piece of data X_1(010) is sent by means of switch 33 directly to the input of computer 2 and arrives at lead K of computer 2 just as X_1(000), the first piece of data from computer 1, arrives at lead J of computer 2. As shown by Equations (2a) and (2c), these two pieces of data, X_1(000) and X_1(010), are then used to calculate in computer 2 the first and third pieces of data X_2(000) and X_2(010), respectively (shown on lines L and M of FIG. 3 as 1 and 3, respectively), produced by the second recursive operation of the Cooley-Tukey algorithm.

As shown by Equation (2b), the second and the fourth pieces of data from computer 1, X_1(001) and X_1(011), respectively, are used to calculate the second and the fourth pieces of data X_2(001) and X_2(011) produced by the second recursive operation of the Cooley-Tukey algorithm. The simultaneous time relationship between these four pieces of data is shown on lines J and K, and L and M of FIG. 3.

The operation of computer 2 on the remaining pieces of data from computer 1 is essentially the same. Delays 6 and 7 operate in conjunction with switches 32 and 33 in such a manner that each pair of input data to computer 2 is operated upon in computer 2 to produce an identically designated pair of output data from computer 2. Lines J and K, and L and M of FIG. 3 show this correspondence.

Switching unit S3 comprises switches 34 and 35, initially in contact with nodes 19 and 21. As shown by the step function labeled S3 in FIG. 3, switches 34 and 35 alternate at the sampling rate from the "up" to the "down" to the "up" to the "down" position, and so on. Thus when the first and third pieces of data X_2(000) and X_2(010) (labeled respectively 1 and 3 on lines L and M of FIG. 3) produced by the second recursive operation leave computer 2, the data X_2(000) is passed through switch 34 to 1-sample delay 9. On the other hand, the third piece of data X_2(010) is stored in 1-sample delay 8 for the period of one sample. Then switches 34 and 35 are moved to the "down" position in contact with nodes 20 and 22, respectively. Thus, when the third piece of data X_2(010) leaves delay 8 it is transmitted through switch 34 to 1-sample delay 9. Meanwhile, the first piece of data X_2(000) from computer 2 on lead L has left delay 9 and is on input lead P to computer 3. At the same time the second piece of data X_2(001) (labeled 2 on line L of FIG. 3) to leave computer 2 on lead L is transmitted through switch 35 to input lead Q of computer 3. Thus, the first two pieces of data produced by the second recursive operation, to be operated on by computer 3 in the third recursive operation of the Cooley-Tukey algorithm, are X_2(000) and X_2(001).

Equations (3a) through (3h) show the operations required by the third recursive equation of the Cooley-Tukey algorithm on the data produced by computer 2.

Now, as shown by Equations (3a) and (3b), the data X_2(000) and X_2(001) produced by computer 2 (labeled 1 and 2 on lines P and Q of FIG. 3) are used in computer 3 to calculate the first two pieces of data X_3(000) and X_3(001) (labeled 1 and 2 on lines R and S of FIG. 3) produced by the third recursive operation of the Cooley-Tukey algorithm.

The generation of the remaining third through eighth pieces of data X_3(010) through X_3(111) by computer 3 from the corresponding pieces of data produced by computer 2 can be traced by following these eight pieces of data through the relative data locations shown in FIG. 3.
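The pairings described for the three computers follow a simple rule that can be listed with a short Python sketch. The helper name `stage_pairs` is hypothetical; it uses the decimal labels 1 through 8 and assumes that the computer of stage s pairs data whose labels differ by N/2^s, which matches the 4-sample, 2-sample, and 1-sample delays 5, 7, and 9.

```python
def stage_pairs(n, stage):
    """Decimal labels (1-based) of the data paired by the computer of a given stage."""
    span = n >> stage                      # 4, 2, 1 for N = 8
    return [(idx + 1, idx + span + 1) for idx in range(n) if (idx & span) == 0]

pairs_computer_1 = stage_pairs(8, 1)
pairs_computer_2 = stage_pairs(8, 2)
pairs_computer_3 = stage_pairs(8, 3)
```

For eight samples this lists (1, 5), (2, 6), (3, 7), (4, 8) for computer 1; (1, 3), (2, 4), (5, 7), (6, 8) for computer 2; and (1, 2), (3, 4), (5, 6), (7, 8) for computer 3.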

Thus, after the first and second pieces of data generated by computer 2 have been used in computer 3 to generate the first and second pieces of data produced by the third recursive operation of the Cooley-Tukey algorithm, the third piece of data produced in computer 2 leaves 1-sample delay 9. This piece of data is passed into computer 3 on lead P just as the fourth piece of data produced by computer 2 passes from delay 8 through switch 35 to input lead Q of computer 3. Computer 3 then produces from these two pieces of data the third and fourth pieces of data generated by the third recursive operation of the Cooley-Tukey algorithm.
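The complete data flow for two channels can be modeled in Python as a rough behavioral sketch. Several things here are assumptions rather than details of the specification: each datum carries a tag (channel, group) and its binary argument so that the computer can select its exponential factor; the computers are treated as having no latency; W = e^{2πi/N}; the product is formed on the item arriving on the direct lead, following the form of Equations (1a) and (1e); and all names (`Delay`, `computer`, `cascade`, `make_lead`) are hypothetical.

```python
import cmath
from collections import deque

N, M = 8, 3                          # eight samples per group, three computers
W = cmath.exp(2j * cmath.pi / N)     # assumed sign convention

def bit_reverse(value, width):
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out

class Delay:
    """Fixed-length sample delay; length 0 passes data straight through."""
    def __init__(self, length):
        self.line = deque([None] * length)
    def step(self, item):
        self.line.append(item)
        return self.line.popleft()

def computer(stage, held, direct):
    """One product, one sum, and one difference per pair of input data."""
    if held is None or direct is None:
        return None, None
    (tag, i, a), (tag2, k, b) = held, direct
    p = M - stage
    # Both inputs must belong to the same group and differ only in digit k_p.
    assert tag == tag2 and k == i | (1 << p)
    product = b * W ** (bit_reverse(i >> (p + 1), stage - 1) << p)
    return (tag, i, a + product), (tag, k, a - product)

def cascade(lead_a, lead_b):
    in_delay = [Delay(N >> s) for s in range(1, M + 1)]                  # 4, 2, 1 samples
    out_delay = [Delay(N >> (s + 1)) for s in range(1, M)] + [Delay(0)]  # 2, 1, none
    results = {}
    for t, (upper, lower) in enumerate(zip(lead_a, lead_b)):
        for s in range(1, M + 1):
            up = (t // (N >> s)) % 2 == 0          # switch unit position
            to_delay, direct = (upper, lower) if up else (lower, upper)
            held = in_delay[s - 1].step(to_delay)
            upper, lower = computer(s, held, direct)
            lower = out_delay[s - 1].step(lower)
        for item in (upper, lower):
            if item is not None:
                tag, idx, value = item
                results.setdefault(tag, {})[idx] = value
    return results

def make_lead(name, groups, offset, length):
    items = [None] * offset
    for g, samples in enumerate(groups):
        items += [((name, g), i, complex(v)) for i, v in enumerate(samples)]
    return items + [None] * (length - len(items))

a_groups = [[1, 2, 3, 4, 5, 6, 7, 8], [2, 0, -1, 3, 1, 1, 0, 5]]
b_groups = [[0, 1, 0, -1, 0, 1, 0, -1], [1j, 2, 3, 0, 0, 1, 4, 2]]
results = cascade(make_lead("A", a_groups, 0, 24), make_lead("B", b_groups, 4, 24))
```

In this model every computer receives a valid pair in every sample period once the pipeline is full, alternating between the two channels, and `results[(channel, group)][idx]` holds the final datum with binary argument `idx`.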

The above-described process continues indefinitely so long as input data continues to enter the processor on leads A and B.

FIG. 2 shows an arithmetic unit appropriate for use in any of the computers of this invention. For convenience in explanation, it will be assumed that this arithmetic unit constitutes computer 1. The input and output leads are thus labeled D, E, F and G to correspond to the leads on computer 1. The input data on lead D is multiplied by e^{iθ} in product generator 40, where θ is determined by the appropriate recursive equation. The resulting product is added to the input data to computer 1 on lead E in network 42 and is subtracted from this input signal in network 41. The resulting output signals on leads F and G represent the appropriate pieces of data specified by the recursive equations of the Cooley-Tukey algorithm.
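The arithmetic unit just described amounts to one complex product, one sum, and one difference. A minimal Python sketch follows; the function name `arithmetic_unit` is hypothetical, and the assignment of the sum to lead F and the difference to lead G is an assumption based on the earlier statement that a subtracting circuit produces the data on lead G.

```python
import cmath

def arithmetic_unit(lead_d, lead_e, theta):
    """Model of FIG. 2 as described in the text.

    lead_d is multiplied by exp(i*theta) (product generator 40); the product
    is added to lead_e (network 42) and subtracted from lead_e (network 41).
    """
    product = lead_d * cmath.exp(1j * theta)   # product generator 40
    lead_f = lead_e + product                  # network 42, assumed to drive lead F
    lead_g = lead_e - product                  # network 41, assumed to drive lead G
    return lead_f, lead_g

f, g = arithmetic_unit(2 + 1j, 3 - 1j, 0.0)
f_pi, g_pi = arithmetic_unit(2 + 1j, 3 - 1j, cmath.pi)
```

Changing θ by π exchanges the two outputs, which is the sign redundancy the processor exploits.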

From the above description of the operation of this invention, it is apparent that it is the arrangement and operation of the switches and delays shown in FIG. 1 in cooperation with the efficient arithmetic unit of FIG. 2 or its equivalent that [...]

1. Apparatus for simultaneously calculating the Fast Fourier Transform of two sequences of data which comprises:

a plurality of serially connected processing units, each unit including:

a computer containing two input and two output leads, for computing simultaneously two new pieces of data from each pair of input data;

switching means for shunting pairs of input data selected or derived from one or the other of said two sequences of data, to the input leads of said computer, according to a predetermined strategy;

a first selected delay in the first input lead of said computer;

and

a second selected delay in the second output lead of said computer.

2. Apparatus as in claim 1 in which said second selected delay is one half said first selected delay except in the last serially connected processing unit in which said second selected delay is zero.

3. Apparatus as in claim 1 in which said switching means comprise two switches for sending selected data to said first selected delay while simultaneously sending other selected data to the second input lead to said computer, in accordance with a predetermined strategy.

4. Apparatus for generating signals representing Fourier series coefficients corresponding to each of a plurality, m, of input sequences of data signals, each of said sequences comprising N data signals, said apparatus comprising a plurality, K, of processing units, the ith of said processing units, i=1, 2, ..., K, comprising:

1. a computer comprising m input leads and m output leads for forming at each of said m output leads a set of N/m output data signals for each of m sets of N signals selectively applied at said m input leads, said computer thereby forming a set of N output signals for each corresponding set of N signals applied at said input leads, and

2. means for distributing subsets of each of the m sets of N output signals from the (i-1)th unit to selected ones of said m input leads, said m input sequences comprising the m sets of signals applied at the input leads of the first unit.

5. The apparatus of claim 4 wherein said computer comprises means for forming as each of said set of N output signals N Fourier series coefficients based on selected ones of said corresponding set of N signals applied at said input leads.

6. The apparatus of claim 5 wherein said means for distributing subsets comprises a store capable of storing at least a subset of said N/m data signals, and switching means for selectively applying data signals from one or more of said output leads from said (i-1)th unit to selected ones of said m input leads at said ith unit.

7. The apparatus of claim 6 wherein said switching means comprises (1) means for applying a first subset of said N/m data signals from a given output lead of said (i-1)th unit to said store, (2) means for subsequently reading said first subset of said N/m data signals from said store and applying them to a first predetermined set of said input leads, and (3) means for reading a second subset of said N/m data signals from said given output lead of said immediately preceding unit and applying them to a second predetermined set of said input leads.

8. Apparatus according to claim 7 wherein said means for forming said output signals comprises (1) a source of trigonometric function signals, (2) means for forming product signals corresponding to the product of selected ones of said trigonometric function signals and selected ones of said signals applied at said input leads, and (3) means for forming signals representing the signed sum of selected ones of said product signals and selected ones of said signals applied at said input leads.

9. The apparatus according to claim 6 wherein m=2, N=2, and said subsets each comprise N/2 data signals.

10. The apparatus according to claim 8 wherein m=2, N=2" and said subsets each comprise N/2 data signals. 

